Abstract

String splicing was introduced by Tom Head as an abstract model of DNA
recombination under the influence of restriction enzymes. The complex
chemical processes of three-dimensional molecules in three-dimensional space
can be modeled using graphs. The graph splicing systems studied so far apply
only to graphs that can be interpreted as linear or circular graphs. In this
paper, we take a different approach to splicing two graphs and introduce a
splicing system that can be applied to all types of graphs. Splicing two
graphs can be viewed as a new graph operation that generates many new graphs
from two given graphs. Some graph-theoretical results on this splicing
operation are also studied.

1 Introduction 

To understand and analyze the complex structure and the evolutionary
process of genes, researchers have long searched for syntactical models.
One such model was a grammatical model provided by formal language theory
PQ. Yet the grammar types in the Chomsky hierarchy were inadequate for
describing biological systems [2].

In his pioneering work, Tom Head proposed an operation called 'splicing'
for describing the recombinant behavior of double-stranded DNA molecules [3],
which established a new relationship between formal language theory and the
study of informational macromolecules. The splicing operation is a formal
model of the recombinant behavior of DNA molecules under the influence of
restriction enzymes and ligases. Informally, splicing two strings means
cutting them at points specified by given substrings (corresponding to
patterns recognized by restriction enzymes) and concatenating the obtained
fragments crosswise (corresponding to the ligation reaction). Since then, the
theory of splicing has become an active area of formal language theory, where
results on splicing systems over string languages (later renamed H-systems
after their originator) gave new insights into some closure properties of
families of string languages [1]. The splicing operation on strings has been
investigated extensively, which led to a language-generating device, the
Extended H-system (EH-system), with the splicing operation as its basic
ingredient [5]. Several control mechanisms were suggested for increasing the
computing power of EH-systems with finite components to that of Turing
machines. Thus, the splicing operation on strings has led to universal
computing devices (programmable DNA computers based on splicing).

A splicing system contains splicing rules of the form $(u_1, u_2; u_3, u_4)$,
where $u_1, u_2, u_3, u_4$ are strings over some alphabet $V$. We apply the
splicing rule to two strings $x_1 u_1 u_2 x_2$ and $y_1 u_3 u_4 y_2$, where
$x_1, x_2, y_1, y_2 \in V^*$. As a result, the new strings $x_1 u_1 u_4 y_2$
and $y_1 u_3 u_2 x_2$ are obtained. We use the modified definition of
splicing as it appears in [1].
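The string splicing rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `splice` and the representation of a rule as a 4-tuple are assumptions made here, not notation from [1].

```python
def splice(w1, w2, rule):
    """Apply the splicing rule (u1, u2; u3, u4) to the strings w1 and w2.

    w1 is cut between an occurrence of u1 and u2, w2 between u3 and u4.
    Returns the set of pairs (x1 u1 u4 y2, y1 u3 u2 x2), one pair for
    every combination of cutting sites.
    """
    u1, u2, u3, u4 = rule
    site1, site2 = u1 + u2, u3 + u4
    results = set()
    for p in range(len(w1) - len(site1) + 1):
        if not w1.startswith(site1, p):
            continue
        cut1 = p + len(u1)          # cut position in w1, right after u1
        for q in range(len(w2) - len(site2) + 1):
            if not w2.startswith(site2, q):
                continue
            cut2 = q + len(u3)      # cut position in w2, right after u3
            results.add((w1[:cut1] + w2[cut2:], w2[:cut2] + w1[cut1:]))
    return results


# x1 = "aa", u1 = "C", u2 = "G", x2 = "bb"; y1 = "cc", u3 = "T", u4 = "A", y2 = "dd"
print(splice("aaCGbb", "ccTAdd", ("C", "G", "T", "A")))
```

With these inputs the only cutting sites are `CG` in the first string and `TA` in the second, so the call yields the single pair `("aaCAdd", "ccTGbb")`.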

DNA sequences are three-dimensional objects in three-dimensional space,
and some problems arise when they are described by one-dimensional strings.
Other models of splicing were therefore explored. In [HI M ED], array splicing
systems were studied. In [B],[IH], graph splicing systems were discussed. But
these systems cannot be applied to graphs that cannot be interpreted as linear
or circular graphs. Hence, we take a different approach to splicing two graphs
and introduce a splicing system that can be applied to all graphs. Splicing
two graphs can be thought of as a new graph operation that generates new
graphs from two given graphs.

The rest of this article is organized as follows. Section 2 discusses the
cutting rules, which are the basic component of the proposed graph splicing
system. Section 3 presents the graph splicing system with illustrations.
Section 4 studies some graph-theoretical properties of this system. The last
section concludes with directions for future research on this graph splicing
system.

2 Definitions 

We follow the terminology and basic notions of graph theory as in [15] and
the terminology of formal language theory as in [13].

For any finite alphabet $\Sigma$, a labeled graph $G$ over $\Sigma$ is a
triple $G = (V, E, L)$, where $V$ is the finite set of vertices (or nodes),
$E$ is the finite set of edges of the form $(n, m)$, $n, m \in V$, $n \neq m$,
where each edge is an unordered pair of vertices, and $L$ is a function from
$V$ to $\Sigma$. An edge $(n, m)$ means that one end-point of the edge is the
vertex $n$ and the other end-point is the vertex $m$. The edge set of $G$ is
written as $E(G)$ and the vertex set as $V(G)$. The number of vertices of a
graph is called the order of the graph, and the number of edges is called the
size of the graph. We consider only simple graphs, where repeated edges
(multiple edges) with the same end-points and edges with both end-points the
same (loops) are not allowed. The graph $G = (V, E, \phi)$ refers to an
unlabeled graph. We denote an unlabeled graph simply by $(V, E)$ instead of
$(V, E, \phi)$. Whenever a graph is considered, we mean a simple unlabeled
graph; we say so explicitly when other kinds of graphs are considered.

Definition 1 A graph $G$ is said to be in Pseudo-Linear Form (PLF) if its
vertices are ordered and positioned according to that order, as if they lie
along a line, and the edges of the graph are drawn accordingly.

The vertices can be ordered in any way. For a particular ordering, the
adjacency matrix of $G$ and the adjacency matrix of $G$ in PLF remain the
same. In a general drawing of a graph, the vertices can be positioned anywhere
and the edges are drawn accordingly. For a graph in PLF, the vertices are
first ordered and then positioned as if they lie on a line. This line may be
horizontal, vertical, or inclined. If the line is horizontal, the ordered
vertices can be positioned either from left to right or from right to left.
So, without loss of generality, we position the vertices from left to right as
if they lie on a horizontal line. Once a graph is in PLF, we name each vertex
with the positive integer that represents its position in the ordering. If a
vertex is second in the ordering, we name that vertex 2. So, the vertex set of
$G$ in PLF is $\{1, 2, 3, \ldots, |V|\}$. Given an ordering of the vertices,
any graph can be redrawn in PLF. For example, if the vertices of the graph

A graph in PLF looks like a path graph with edges going above or below the
linear path. The graph $P_n$ with its vertices written horizontally is a graph
in PLF. From now on, unless otherwise mentioned, by a graph we mean one in PLF
for some ordering of the elements of $V$.

Definition 2 A cutting rule $C$ for a graph $G = (V, E)$ in PLF is a pair
$[i, j]$, where $i$ and $j$ are positive integers with
$1 \le i \le j \le |V|$ and $|j - i| \le 1$.

By the condition $|j - i| \le 1$, we mean that the vertices $i$ and $j$ are
either successive vertices (ordered successively) or the same vertex.
A cutting rule $C = [i, j]$ is called a reflexive cutting rule if $i = j$.

Definition 3 The left-degree $ld_G(v)$ of a vertex $v \in V(G)$ is the number
of edges of $G$ incident with $v$ that lie to the left of $v$. The
right-degree $rd_G(v)$ of a vertex $v \in V(G)$ is the number of edges of $G$
incident with $v$ that lie to the right of $v$. The degree of $v$, $d(v)$,
the number of edges that are incident with $v$, is the sum of the left-degree
and the right-degree of $v$. When $G$ is clear from the context, we write
$ld(v)$ and $rd(v)$.

Definition 4 Let $V_l(v)$ be the set of all vertices that lie to the left of
the vertex $v$ ($v$ is not in $V_l(v)$). Similarly, $V_r(v)$ is the set of all
vertices that lie to the right of the vertex $v$.
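Definitions 3 and 4 translate directly into code once a graph in PLF is stored as a list of edges over the vertex names 1, 2, ..., n. The following is a minimal sketch under that assumption; the function names are hypothetical and chosen here only for illustration.

```python
def left_degree(edges, v):
    """ld(v): edges incident with v whose other end-point lies to the left of v."""
    return sum(1 for a, b in edges if max(a, b) == v)


def right_degree(edges, v):
    """rd(v): edges incident with v whose other end-point lies to the right of v."""
    return sum(1 for a, b in edges if min(a, b) == v)


def left_vertices(v):
    """V_l(v): all vertices strictly to the left of v."""
    return list(range(1, v))


def right_vertices(n, v):
    """V_r(v): all vertices strictly to the right of v in a graph of order n."""
    return list(range(v + 1, n + 1))


# The cycle C_4 in PLF with the ordering 1, 2, 3, 4.
c4 = [(1, 2), (2, 3), (3, 4), (1, 4)]
for v in range(1, 5):
    print(v, left_degree(c4, v), right_degree(c4, v))
```

Because the graph is simple, every edge has two distinct end-points, so each edge is counted once as a right edge of its smaller end-point and once as a left edge of its larger end-point.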

Scheme of cutting 

A graph is cut into two parts by cutting some of its edges. The cutting rule
$[i, j]$ cuts a graph $G$ between the vertex $i$ and the vertex $j$ (if for
some reason the vertices are named with symbols other than positive integers,
the cutting rule cuts between the vertex in the $i$-th position of the
ordering and the vertex in the $j$-th position). The cutting rule $[i, j]$
over $G$ cuts the edge $(i, j)$ and the edges that go above as well as below
the edge $(i, j)$. That is, the cutting rule $[i, j]$ cuts the following
edges (if they exist in the graph $G$).

1. The edge $(i, j)$

2. The edges $(i, v)$, $v \in V_r(j)$

3. The edges $(v, j)$, $v \in V_l(j)$

4. The edges $(u, v)$, $u \in V_l(i)$, $v \in V_r(j)$

The reflexive cutting rule $[i, i]$ cuts the vertex $i$ and all the edges that
go above as well as below the vertex $i$. That is, the reflexive cutting rule
$[i, i]$ cuts the following.

1. The vertex $i$

2. The edges $(u, v)$, $u \in V_l(i)$, $v \in V_r(i)$
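The two lists above can be folded into a single membership test on the end-points of an edge. The sketch below assumes a graph in PLF given as an edge list over the vertex names 1, ..., n; the function name `ecut` is hypothetical.

```python
def ecut(edges, i, j):
    """Return the edges cut by the cutting rule [i, j], where j is i or i + 1.

    Non-reflexive rule (j == i + 1): an edge is cut when its left end-point
    is at most i and its right end-point is at least j.
    Reflexive rule (j == i): an edge is cut when it passes over vertex i,
    that is, its end-points lie strictly on either side of i.
    """
    if j not in (i, i + 1):
        raise ValueError("a cutting rule needs j == i or j == i + 1")
    cut = []
    for a, b in edges:
        u, v = min(a, b), max(a, b)
        if i == j:
            if u < i < v:
                cut.append((u, v))
        elif u <= i and v >= j:
            cut.append((u, v))
    return sorted(cut)


# Complete graph on five vertices in PLF.
k5 = [(a, b) for a in range(1, 6) for b in range(a + 1, 6)]
print(ecut(k5, 2, 3))   # six edges cross between vertices 2 and 3
print(ecut(k5, 3, 3))   # four edges pass over vertex 3
```

The non-reflexive test covers all four cases of the first list at once, since each of them describes an edge with one end-point among 1, ..., i and the other among j, ..., n.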

1 For the cutting rule [i, j] we use square brackets, and for the edge (i, j)
we use parentheses.






When an edge $(i, j)$ is cut into two parts, we call the two parts of the edge
hanging-edges or free-edges. Similarly, when a vertex is cut, we call that
vertex a hanging-vertex or a free-vertex. If an edge $(i, j)$ is cut, we write
the left part of the edge as $(i, j]$ (indicating that the free end is the
right end) and the right part of the edge as $[i, j)$ (indicating that the
free end is the left end). The edges $(i, j]$ and $[i, j)$ are drawn with a ×
at their free ends. If a vertex $v$ is cut, the left part of the vertex is
written as $v]$ and the right part as $[v$; $[v]$ indicates just that the
vertex $v$ is cut.

[Figure: the edge (i, j) and its two hanging-edges (i, j] and [i, j)]



The set $ECUT_G(C)$ is the set of all edges of $G$ that are cut by the cutting
rule $C$, and its cardinality $|ECUT_G(C)|$ is the power of the cutting rule
$C$ with respect to the graph $G$. The power of a cutting rule with respect to
$G$ is thus the number of edges of $G$ that are cut by that cutting rule. The
set $VCUT_G(C)$ is the set of all vertices that are cut by the cutting rule
$C$. This set is nonempty only for reflexive cutting rules; for all other
cutting rules it is $\phi$. Since a reflexive cutting rule cuts exactly one
vertex, the set $VCUT_G(C)$ is then a singleton. For a reflexive cutting rule
$[i, i]$, the set $ECUT_G(C)$ can be $\phi$ (meaning that no edge goes above
or below the vertex $i$ in the graph $G$).

When a graph $G$ is cut into two by a cutting rule $C$, the left part of the
graph is called $Prefix(G)$ and the right part is called $Suffix(G)$.
Obviously, for a non-reflexive cutting rule $[i, j]$ with $(i, j) \in E(G)$,

$$ECUT_G([i, j]) = ECUT_G([i, i]) \cup ECUT_G([j, j]) \cup \{(i, j)\}.$$

We illustrate the cutting of the graph $K_5$, the complete graph on five
vertices, using the cutting rule $[2, 3]$.

[Figure: the graph $K_5$ cut by the cutting rule $[2, 3]$, showing
$Prefix(K_5)$ and $Suffix(K_5)$]

$Prefix(K_5)$ and $Suffix(K_5)$ are also graphs. $Prefix(K_5)$ has the vertex
set

$$V(Prefix(K_5)) = \{1, 2\}$$

and the edge set

$$E(Prefix(K_5)) = \{(1, 2), (1, 3], (1, 4], (1, 5], (2, 3], (2, 4], (2, 5]\}.$$

Similarly, $V(Suffix(K_5)) = \{3, 4, 5\}$ and

$$E(Suffix(K_5)) = \{[2, 3), [2, 4), [2, 5), [1, 3), [1, 4), [1, 5), (3, 4), (3, 5), (4, 5)\}.$$

Moreover, $ECUT_{K_5}([2, 3]) = \{(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)\}$
and $VCUT_{K_5}([2, 3]) = \phi$.
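The $K_5$ illustration can be reproduced with a short sketch that splits a graph in PLF into its prefix and suffix under a non-reflexive cutting rule $[i, i+1]$. The function name `cut_graph`, the dictionary layout, and the string form of the hanging-edges are assumptions made for this example only.

```python
def cut_graph(n, edges, i):
    """Cut a PLF graph on the vertices 1..n with the rule [i, i + 1].

    Returns (prefix, suffix). Each part lists its vertices, its intact
    edges, and its hanging-edges written as "(u, v]" (free right end)
    or "[u, v)" (free left end).
    """
    j = i + 1
    norm = sorted((min(a, b), max(a, b)) for a, b in edges)
    cut = [(u, v) for u, v in norm if u <= i and v >= j]
    prefix = {
        "vertices": list(range(1, i + 1)),
        "edges": [(u, v) for u, v in norm if v <= i],
        "hanging": [f"({u}, {v}]" for u, v in cut],
    }
    suffix = {
        "vertices": list(range(j, n + 1)),
        "edges": [(u, v) for u, v in norm if u >= j],
        "hanging": [f"[{u}, {v})" for u, v in cut],
    }
    return prefix, suffix


k5 = [(a, b) for a in range(1, 6) for b in range(a + 1, 6)]
prefix, suffix = cut_graph(5, k5, 2)
print(prefix)
print(suffix)
```

For $K_5$ and the rule $[2, 3]$, the prefix keeps the single intact edge between 1 and 2, the suffix keeps the three intact edges among the vertices 3, 4 and 5, and each part carries six hanging-edges.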

3 Graph Splicing System 

Definition 5 A splicing rule $S = (C_1, C_2)$ is a pair of cutting rules.

Given two graphs $G$, $H$ and a splicing rule $S = (C_1, C_2)$, the first
graph $G$ is cut as specified by $C_1$ and the second graph $H$ is cut as
specified by $C_2$. As a result we get four cut-graphs, namely $Prefix(G)$,
$Suffix(G)$, $Prefix(H)$ and $Suffix(H)$.

Mode of recombination 

Definition 6 $Prefix(G)$ (or $Prefix(H)$) recombines with $Suffix(H)$ (or
$Suffix(G)$) if and only if $|ECUT_G(C_1)| = |ECUT_H(C_2)|$ and
$|VCUT_G(C_1)| = |VCUT_H(C_2)|$. In other words, for a recombination, the
number of hanging-edges in $Prefix(G)$ (or $Prefix(H)$) should be the same as
the number of hanging-edges in $Suffix(H)$ (or $Suffix(G)$), and the number of
hanging-vertices in $Prefix(G)$ (or $Prefix(H)$) should be the same as the
number of hanging-vertices in $Suffix(H)$ (or $Suffix(G)$).

The above definition says that for a splicing process to end in a
recombination, both cutting rules in the splicing rule $S$ must have the same
power. We assign a positive integer, called the power of the splicing rule, to
a splicing rule $S = (C_1, C_2)$ if and only if the powers of $C_1$ and $C_2$
are the same; this common value is the power of $S$. Further, if one cutting
rule in $S$ is reflexive, the other must also be reflexive.

Definition 7 Every hanging-edge of $Prefix(G)$ (or $Prefix(H)$) recombines
(or joins) with exactly one hanging-edge of $Suffix(H)$ (or $Suffix(G)$), and
every hanging-edge of $Suffix(H)$ (or $Suffix(G)$) recombines with exactly one
hanging-edge of $Prefix(G)$ (or $Prefix(H)$). The hanging-vertex (if any) of
$Prefix(G)$ (or $Prefix(H)$) recombines with the hanging-vertex of
$Suffix(H)$ (or $Suffix(G)$).

Thus, $Prefix(G)$ recombines with $Suffix(H)$ to generate new graphs. After
the recombination, we order the vertices of the new graph (which is again in
PLF) in the sequence in which they appear and name them accordingly. New
graphs are generated by the recombination of the edges that were cut. If
there is more than one hanging-edge in both $Prefix(G)$ and $Suffix(H)$, the
hanging-edges of $Prefix(G)$ can recombine with the hanging-edges of
$Suffix(H)$ in more than one way. If there are $m$ hanging-edges in both
$Prefix(G)$ and $Suffix(H)$, the hanging-edges can recombine in $m!$ ways,
generating $m!$ new graphs. In other words, the number of such recombinations
is the number of bijective mappings from the set $ECUT_G(C_1)$ to the set
$ECUT_H(C_2)$. When $Prefix(H)$ recombines with $Suffix(G)$, the same number
$m!$ of graphs is generated. Thus, splicing two graphs $G$ and $H$ using a
splicing rule of power $m$ generates $2(m!)$ new graphs.
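The count $m!$ is the number of bijections between the two sets of hanging-edges, which can be enumerated with permutations. The sketch below is only illustrative; the generator name `recombinations` and the string labels of the hanging-edges are assumptions.

```python
from itertools import permutations


def recombinations(prefix_hanging, suffix_hanging):
    """Yield every one-to-one pairing of prefix and suffix hanging-edges.

    Each pairing is a list of (prefix_edge, suffix_edge) tuples. Nothing is
    yielded when the two lists differ in length, since no recombination is
    possible in that case.
    """
    if len(prefix_hanging) != len(suffix_hanging):
        return
    for perm in permutations(suffix_hanging):
        yield list(zip(prefix_hanging, perm))


pairings = list(recombinations(["(1, 2]", "(1, 3]"], ["[2, 3)", "[1, 4)"]))
print(len(pairings))   # 2! pairings
for pairing in pairings:
    print(pairing)
```

Different pairings can still produce the same or isomorphic graphs, so $m!$ counts recombinations rather than distinct graphs.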
Thus, the splicing process comprises cutting as well as recombination. If the
splicing of $G$ and $H$ using $S$ generates a new graph $F$ by the
recombination of $Prefix(G)$ with $Suffix(H)$, we denote this by
$G \vdash^1_S H = F$ (indicating that $F$ is a first splicing product).
Similarly, $G \vdash^2_S H = F$ indicates that $F$ is generated by the
recombination of $Prefix(H)$ with $Suffix(G)$ (that is, $F$ is a second
splicing product). Simply $G \vdash_S H = F$ indicates that $F$ may be either
a first or a second splicing product. The splicing scheme (process) is denoted
by $\sigma$. A splicing process requires two graphs and a splicing rule. The
set of all graphs generated by splicing $G$ and $H$ using the splicing rule
$S$ is denoted by $\sigma(\{G, H\}, S)$. The sets $\sigma_1(\{G, H\}, S)$ and
$\sigma_2(\{G, H\}, S)$ of first and second splicing products are defined
accordingly.

Definition 8 A graph splicing system is a pair
$\nu = (\mathfrak{A}, \mathfrak{S})$, where

$\mathfrak{A}$ is a finite set of simple, unlabeled graphs, called the set of
axioms, and

$\mathfrak{S}$ is a finite set of splicing rules.

The underlying splicing scheme is $\sigma(\{G, H\}, S)$,
$G, H \in \mathfrak{A}$, $S \in \mathfrak{S}$.

The graph language of the splicing system $\nu$ is the set of all graphs
generated by splicing all pairs of graphs of $\mathfrak{A}$ with all splicing
rules of $\mathfrak{S}$:

$$L(\nu) = \sigma(\mathfrak{A}) = \bigcup_{G, H \in \mathfrak{A},\ S \in \mathfrak{S}} \sigma(\{G, H\}, S)$$



In DNA recombination, when some restriction enzymes and a ligase are present
in a test tube, they do not stop after one cut-and-paste operation but act
iteratively. The products of a splicing take part in the splicing process
again. For iterative splicing among graphs, the axiom set should contain many
copies of the same element. Ordinary sets are composed of pairwise different
elements, i.e., no two elements are the same. If we relax this condition and
allow multiple occurrences of an element, we get a generalization of the
notion of a set called a multiset. We assume that our axiom set is a multiset
in which infinitely many copies of each element are present, so that the
elements can take part in the splicing process iteratively. Every product of
a splicing process is likewise available in infinitely many copies. To turn a
graph splicing system into an iterated graph splicing system, the only
requirement is to make the axiom set $\mathfrak{A}$ a multiset that contains
infinitely many copies of each of its elements.

Definition 9 The graph language of an iterative graph splicing system
$\nu = (\mathfrak{A}, \mathfrak{S})$, where $\mathfrak{A}$ is a multiset
containing infinitely many copies of each of its elements, is defined as
$L(\nu) = \sigma^*(\mathfrak{A})$, where

$$\sigma^0(\mathfrak{A}) = \mathfrak{A},$$

$$\sigma^{i+1}(\mathfrak{A}) = \sigma^i(\mathfrak{A}) \cup \sigma(\sigma^i(\mathfrak{A})),$$

$$\sigma^*(\mathfrak{A}) = \bigcup_{i \ge 0} \sigma^i(\mathfrak{A}).$$

Example 1 Consider the graph splicing system
$\nu = (\{C_3, C_4\}, \{([1, 2], [2, 3])\})$, where $C_3$ and $C_4$ are the
cycle graphs of order 3 and 4 respectively. Then

$$L(\nu) = \sigma(\{C_3, C_4\}, S) \cup \sigma(\{C_4, C_3\}, S) \cup \sigma(\{C_3, C_3\}, S) \cup \sigma(\{C_4, C_4\}, S),$$

where $S$ is the splicing rule $([1, 2], [2, 3])$. The power of the splicing
rule is 2. In each splicing process, $2(2!) = 4$ new graphs are generated, so
$L(\nu)$ has a total of 16 new graphs. Some of these graphs are isomorphic to
each other. The non-isomorphic graphs in $L(\nu)$ are $C_3$, $C_4$, $C_5$ and
a graph $G$, where $G = (\{1, 2\}, \{(1, 2), (1, 2)\})$. The graph $G$ is not
a simple graph (it has a multiple edge between the vertices 1 and 2). We
conclude that the splicing product of two simple graphs need not be simple.
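Example 1 can be replayed mechanically. The sketch below is an illustration under several assumptions that are not part of the paper: a graph in PLF is stored as a pair (order, edge list) with every edge written as (u, v), u < v; only non-reflexive cutting rules are handled; a product is returned as (order, sorted edge tuple), so a repeated pair stands for a multiple edge. The names `cut`, `recombine` and `splice` are hypothetical.

```python
from itertools import permutations


def cut(graph, rule):
    """Cut a PLF graph with the non-reflexive rule (i, i + 1)."""
    _, edges = graph
    i, j = rule
    assert j == i + 1, "this sketch handles non-reflexive rules only"
    left = [(u, v) for u, v in edges if v <= i]                  # intact prefix edges
    right = [(u, v) for u, v in edges if u >= j]                 # intact suffix edges
    crossing = [(u, v) for u, v in edges if u <= i and v >= j]   # ECUT
    return left, right, crossing


def recombine(g, rule_g, h, rule_h):
    """All graphs obtained by joining Prefix(g) with Suffix(h)."""
    left, _, cut_g = cut(g, rule_g)
    _, right, cut_h = cut(h, rule_h)
    if len(cut_g) != len(cut_h):
        return []                          # the powers differ: no recombination
    shift = rule_g[0] - rule_h[0]          # renames suffix vertex w to w + shift
    order = rule_g[0] + h[0] - rule_h[0]
    kept = left + [(u + shift, v + shift) for u, v in right]
    free_left = [u for u, _ in cut_g]      # prefix end-points of the hanging-edges
    free_right = [v + shift for _, v in cut_h]
    products = []
    for perm in permutations(free_right):  # one product per bijection
        joined = list(zip(free_left, perm))
        products.append((order, tuple(sorted(kept + joined))))
    return products


def splice(g, h, s):
    """First products (Prefix(g) + Suffix(h)), then second products."""
    c1, c2 = s
    return recombine(g, c1, h, c2) + recombine(h, c2, g, c1)


C3 = (3, [(1, 2), (1, 3), (2, 3)])
C4 = (4, [(1, 2), (1, 4), (2, 3), (3, 4)])
S = ((1, 2), (2, 3))
products = [p for g in (C3, C4) for h in (C3, C4) for p in splice(g, h, S)]
print(len(products))
print(sorted(set(products)))
```

Under these assumptions the four ordered pairs of axioms give 16 products in total, of orders 2, 3, 4 and 5, and the product of order 2 has the edge (1, 2) twice, which is the non-simple graph of the example.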
4 Properties



Proposition 1 Given a graph $G$, the power of the cutting rule $[i, j]$,
$i \neq j$, with respect to the graph $G$ is

$$rd(i) - ld(i) + \sum_{v \in V_l(i)} (rd(v) - ld(v)).$$

Proof: Let $G$ be the given graph. We count the total number of edges of $G$
that are cut by the cutting rule $[i, j]$, which is the power of the cutting
rule. We split the proof into two cases according to whether the edge
$(i, j)$ exists in $G$ or not.

Case (i): $(i, j) \in E(G)$.

We know that the cutting rule $[i, j]$ cuts the following edges.

1. The edge $(i, j)$

2. The edges $(i, v)$, $v \in V_r(j)$

3. The edges $(v, j)$, $v \in V_l(j)$

4. The edges $(u, v)$, $u \in V_l(i)$, $v \in V_r(j)$

The expression

$$rd(i) + ld(j) - 1 \qquad (1)$$

gives the number of edges that fall under (1), (2) and (3) in the list above.
Since the edge $(i, j)$ is counted in both $rd(i)$ and $ld(j)$, we subtract
one from the expression.

Let $A$ be the set of edges whose left end is in $V_l(i)$. Let $B \subset A$
be the set of edges of $A$ whose right end is in $V_l(i)$, i.e., both ends of
the edges in $B$ are in $V_l(i)$. Let $C \subset A$ be the set of edges of $A$
whose right end is $i$, i.e., for the edges in $C$, one end is in $V_l(i)$ and
the other end is $i$. Let $D \subset A$ be the set of edges of $A$ whose right
end is $j$, i.e., for the edges in $D$, one end is in $V_l(i)$ and the other
end is $j$. Let $E$ be the set of edges of $A$ whose right end is in
$V_r(j)$, i.e., the set of edges whose left end is in $V_l(i)$ and whose right
end is in $V_r(j)$. Obviously, the set of edges that come under (4) in the
above list is $E$. We have

$$|A| = \sum_{v \in V_l(i)} rd(v); \quad |B| = \sum_{v \in V_l(i)} ld(v); \quad |C| = ld(i); \quad |D| = ld(j) - 1.$$

Since the edge $(i, j)$ is counted in $ld(j)$ but does not belong to $D$, we
subtract one from $ld(j)$. The number of edges that come under (4) is

$$|E| = |A| - |B| - |C| - |D| = \sum_{v \in V_l(i)} rd(v) - \sum_{v \in V_l(i)} ld(v) - ld(i) - (ld(j) - 1). \qquad (2)$$

Hence, the total number of edges cut by $[i, j]$ is

$$(1) + (2) = rd(i) - ld(i) + \sum_{v \in V_l(i)} (rd(v) - ld(v)).$$

Case (ii): $(i, j) \notin E(G)$.

We proceed as in case (i). The number of edges that come under (1), (2) and
(3) is

$$rd(i) + ld(j).$$

The number of edges that come under (4) is

$$\sum_{v \in V_l(i)} rd(v) - \sum_{v \in V_l(i)} ld(v) - ld(i) - ld(j).$$

Hence, the total number of edges cut by $[i, j]$ is

$$rd(i) - ld(i) + \sum_{v \in V_l(i)} (rd(v) - ld(v)).$$

In both cases we get the same expression. Hence the proof.
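The formula of Proposition 1 can be compared with a direct count of the cut edges. The sketch below assumes a graph in PLF given as an edge list over the vertices 1, ..., n and a non-reflexive rule $[i, i+1]$; all function names are hypothetical.

```python
def ld(edges, v):
    """Left-degree of v."""
    return sum(1 for a, b in edges if max(a, b) == v)


def rd(edges, v):
    """Right-degree of v."""
    return sum(1 for a, b in edges if min(a, b) == v)


def power_direct(edges, i):
    """Number of edges with one end in 1..i and the other in i+1..n."""
    return sum(1 for a, b in edges if min(a, b) <= i < max(a, b))


def power_formula(edges, i):
    """rd(i) - ld(i) + sum over v in V_l(i) of (rd(v) - ld(v))."""
    return rd(edges, i) - ld(edges, i) + sum(
        rd(edges, v) - ld(edges, v) for v in range(1, i)
    )


k5 = [(a, b) for a in range(1, 6) for b in range(a + 1, 6)]
for i in range(1, 5):
    print(i, power_direct(k5, i), power_formula(k5, i))
```

For $K_5$ and the rule $[2, 3]$ both functions return 6, in agreement with the six edges of $ECUT_{K_5}([2, 3])$.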

Theorem 1 In any graph G, the sum of the differences between the right degree 
and the left degree of all the vertices is zero. 

Proof: In Proposition 1, in computing the power of a cutting rule $[i, j]$,
we counted the number of edges with one end in $V_l(i)$ and the other end in
$V_r(j)$ by deleting some edges from the set $A$ of edges whose left end is in
$V_l(i)$. Instead, we can take $A$ to be the set of edges whose right end is
in $V_r(j)$ and proceed analogously to the proof of Proposition 1. We then get
the power of the cutting rule to be

$$ld(j) - rd(j) + \sum_{v \in V_r(j)} (ld(v) - rd(v)),$$

which is symmetric to the expression obtained in Proposition 1.

Since the power of a cutting rule is a constant with respect to $G$, both
expressions must be equal:

$$rd(i) - ld(i) + \sum_{v \in V_l(i)} (rd(v) - ld(v)) = ld(j) - rd(j) + \sum_{v \in V_r(j)} (ld(v) - rd(v)).$$

Since $V = V_l(i) \cup \{i, j\} \cup V_r(j)$, moving all terms to the
left-hand side gives

$$\sum_{v \in V} (rd(v) - ld(v)) = 0, \quad \text{or equivalently} \quad \sum_{v \in V} (ld(v) - rd(v)) = 0.$$

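Corollary 1 The number of edges in a graph $G$ is always

$$|E| = \sum_{v \in V} ld(v) = \sum_{v \in V} rd(v).$$

Proof: By Theorem 1 we have

$$\sum_{v \in V} (ld(v) - rd(v)) = 0,$$

so $\sum_{v \in V} ld(v) = \sum_{v \in V} rd(v)$. Since
$d(v) = ld(v) + rd(v)$ and the sum of the degrees of all the vertices is
$2|E|$, this implies

$$\sum_{v \in V} ld(v) = |E| = \sum_{v \in V} rd(v).$$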

Remark 1 The above Corollary can also be proved in another way using the 
fact that every edge should contribute one to the left degree of some vertex and 
one to the right degree of some other vertex. 
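The counting argument of the remark can be written out directly: each edge adds one to the right-degree of its left end-point and one to the left-degree of its right end-point. The sketch below assumes an edge list over the vertices 1, ..., n of a graph in PLF; the function name is hypothetical.

```python
def left_right_sums(n, edges):
    """Return (sum of all left-degrees, sum of all right-degrees)."""
    ld = [0] * (n + 1)
    rd = [0] * (n + 1)
    for a, b in edges:
        rd[min(a, b)] += 1   # the edge leaves its left end-point to the right
        ld[max(a, b)] += 1   # and reaches its right end-point from the left
    return sum(ld), sum(rd)


edges = [(1, 3), (2, 5), (3, 4), (1, 5), (2, 3)]
print(left_right_sums(5, edges), len(edges))
```

Both sums equal the number of edges, here 5, regardless of the ordering chosen for the vertices.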

For want of space, we state some of the results without proofs.

Theorem 2 1. $G \vdash_S H = H \vdash_{S^R} G$, where $S^R$ is the splicing
rule obtained from $S$ by swapping its cutting rules.

2. $G \vdash_S H \neq H \vdash_S G$, i.e., the splicing operation is not
commutative.

3. The splicing operation preserves the degrees of the vertices.

4. Regularity is preserved by splicing, i.e., if we splice any two regular
graphs, the splicing product is again a regular graph.

5. The maximum size of the splicing product of $G$ and $H$ is the sum of the
orders of $G$ and $H$ minus 1.

6. For a complete graph $K_n$, $rd(i) = ld(n + 1 - i)$ for every vertex
$i \in V(K_n)$.

7. The set of all simple graphs is not closed with respect to the splicing
operation.
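Item 6 of the theorem is easy to check numerically for small complete graphs. The sketch below assumes $K_n$ in PLF with the vertices named 1, ..., n; the helper names are hypothetical.

```python
from itertools import combinations


def complete_graph(n):
    """Edge list of K_n on the vertices 1..n."""
    return list(combinations(range(1, n + 1), 2))


def ld(edges, v):
    """Left-degree of v."""
    return sum(1 for a, b in edges if max(a, b) == v)


def rd(edges, v):
    """Right-degree of v."""
    return sum(1 for a, b in edges if min(a, b) == v)


def mirror_property(n):
    """Check rd(i) == ld(n + 1 - i) for every vertex i of K_n."""
    edges = complete_graph(n)
    return all(rd(edges, i) == ld(edges, n + 1 - i) for i in range(1, n + 1))


print([mirror_property(n) for n in range(1, 8)])
```

In $K_n$ the vertex $i$ has $n - i$ neighbours to its right and the vertex $n + 1 - i$ has $n - i$ neighbours to its left, which is why the check succeeds for every $n$ tried.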

Theorem 3 A graph $G$ contains a cycle if and only if there exists a sequence
$A$ of successive cutting rules$^2$ $[i, i + 1]$ with power $> 1$ such that

$$\bigcap_{[i, i + 1] \in A} ECUT_G([i, i + 1]) \neq \phi.$$

2 The rules [i, i + 1] and [i + 1, i + 2] are termed successive cutting rules.

Theorem 4 Let $G$ and $H$ be any two isomorphic graphs. Let
$G \vdash_S H = F$ for any splicing rule $S$. Then $F$ is isomorphic to $G$
(or $H$) if and only if the order of the graph $F$ and the order of $G$ (or
the order of $H$) are the same.

Theorem 5 If for a graph $G$ there exists only one cutting rule whose power is
equal to the size of the graph $G$, then $G$ is bipartite.

5 Conclusion

As graphs are better suited for representing complex structures, a model for
splicing graphs, the graph splicing system, is introduced, which can be
applied to all types of graphs. Although graph splicing is introduced here as
a new operation on graphs, studying the computational power of the graph
splicing system is an important area to explore. One can introduce various
parameters, such as the number of graphs in the axiom set, the number of
splicing rules, and the power of the splicing rules, and find the minimum
values of these parameters for which the graph splicing system is still
computationally complete. Besides, as a new line of thinking, investigating
the utility of splicing in graph theory is worthwhile.